Guardrail Resource Management
Within DynamoGuard, each policy is associated with a specific model that runs on a GPU. To efficiently manage compute resources for input content policies, you have the ability to scale up (deploy) or scale down (deactivate) policies. When a policy is scaled down, it cannot be used via the /analyze/ or /chat/ endpoints. This can be helpful while a policy is being created and tested but is not yet integrated into an application.
Scaling Up and Down Policies via UI
To manage the deployment state of policies, you can toggle the policy from the UI. For content policies where training is complete, selecting More Options will show a button to either Scale Up Policy or Scale Down Policy based on the policy's current state.
Scaling Up and Down Policies via API
To manage the deployment state of policies, you send a PUT request to the Policy API Endpoint using the isEnabled parameter in the request body. To scale up a policy, set isEnabled to true, and to scale down a policy, set isEnabled to false. Below is an example of scaling down a policy:
import requests

# Scale down a policy by setting isEnabled to false
url = "{YOUR_DYNAMOAI-URL}/moderation/policy/{POLICY_ID}"
headers = {
    "Authorization": "Bearer {YOUR_DYNAMOAI-TOKEN}",
    "Content-Type": "application/json"
}
data = {
    "isEnabled": False
}
response = requests.put(url, headers=headers, json=data)
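To scale a policy back up, send the same request with isEnabled set to true. As a sketch, a small helper can cover both directions; the function names below are illustrative and not part of any DynamoGuard SDK:

```python
import requests


def build_scale_request(base_url, policy_id, enabled):
    """Build the Policy API URL and JSON body for scaling up or down."""
    return f"{base_url}/moderation/policy/{policy_id}", {"isEnabled": enabled}


def set_policy_deployment(base_url, token, policy_id, enabled):
    """Scale a policy up (enabled=True) or down (enabled=False)."""
    url, body = build_scale_request(base_url, policy_id, enabled)
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # PUT with isEnabled=true deploys the policy; false deactivates it
    return requests.put(url, headers=headers, json=body)
```

For example, `set_policy_deployment(base_url, token, policy_id, enabled=True)` scales the policy up.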
Scaling Up and Down - Policy Statuses
After scaling policies up or down, you will observe the following status changes. These states help you track the readiness of your model-backed policies for inference tasks:
- Scaling Up: Policies will transition from Not Deployed to Scaling Up to Deployed
- Scaling Down: Policies will transition from Deployed to Scaling Down to Not Deployed
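Because scaling is asynchronous, a client typically polls the policy until it reaches a terminal state before sending traffic. A minimal sketch, assuming a GET on the same policy endpoint returns the current state under a `status` field (both the GET endpoint and the field name are assumptions; consult the API reference for the actual response shape):

```python
import time

import requests

# Terminal states from the transitions described above
TERMINAL_STATES = {"Deployed", "Not Deployed"}


def is_transition_complete(status):
    """True once a policy has finished scaling up or down."""
    return status in TERMINAL_STATES


def wait_for_policy(url, headers, timeout=300, interval=10):
    """Poll the policy endpoint until it leaves Scaling Up/Scaling Down."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = requests.get(url, headers=headers).json().get("status")
        if is_transition_complete(status):
            return status
        time.sleep(interval)
    raise TimeoutError("Policy did not reach a terminal state in time")
```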